AITopics | task requirement

19206a6ed5ed0aaeed440448dfc5cf7e-Paper-Conference.pdf

Neural Information Processing Systems, Jun-15-2026, 06:26:03 GMT

LLM-agent systems often decompose high-level objectives into subtask dependency graphs, assuming that each subtask's output is reliable and conditionally independent of others given its parent responses. However, this assumption frequently breaks during execution, as ground-truth responses are inaccessible, leading to inter-agent misalignment: failures caused by inconsistencies and coordination breakdowns among agents [1]. To address this, we propose SEQCV, a dynamic framework for reliable execution under violated conditional independence. SEQCV executes subtasks sequentially, each conditioned on all prior verified responses, and performs consistency checks immediately after agents generate short token sequences. At each checkpoint, a token sequence is accepted only if it represents shared knowledge consistently supported across diverse LLMs; otherwise, it is discarded, triggering recursive subtask decomposition for finer-grained reasoning. Despite its sequential nature, SEQCV avoids repeated corrections on the same misalignment and achieves higher effective throughput than parallel pipelines. Across multiple reasoning and coordination tasks, SEQCV improves accuracy by up to 30% over existing LLM-agent systems.
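The control flow described in this abstract (sequential execution, a consistency check at each step, and recursive decomposition on disagreement) can be sketched roughly as follows. This is an illustrative sketch only, not the SEQCV implementation: the names `run_sequential`, `split`, and the toy models are hypothetical, and "accepted only if all models give the same answer" is an assumed stand-in for the paper's consistency criterion.

```python
from typing import Callable, List, Optional

# Hypothetical stand-in: a "model" maps (subtask, verified context) to a short answer.
Model = Callable[[str, List[str]], str]


def run_sequential(
    subtasks: List[str],
    models: List[Model],
    split: Callable[[str], List[str]],
    max_depth: int = 2,
    depth: int = 0,
    context: Optional[List[str]] = None,
) -> List[str]:
    """Run subtasks in order; keep an answer only if every model agrees on it."""
    context = [] if context is None else context
    for task in subtasks:
        # Each model sees all previously verified responses.
        answers = {model(task, context) for model in models}
        if len(answers) == 1:
            context.append(answers.pop())  # consistent: accept as verified
        elif depth < max_depth:
            # Inconsistent: discard the answers and retry on finer-grained subtasks.
            run_sequential(split(task), models, split, max_depth, depth + 1, context)
        else:
            context.append(f"UNRESOLVED: {task}")
    return context


# Toy models (assumptions): they agree on short tasks and disagree on long ones.
model_a: Model = lambda task, ctx: task.upper() if len(task) <= 3 else "a"
model_b: Model = lambda task, ctx: task.upper() if len(task) <= 3 else "b"
halve = lambda task: [task[: len(task) // 2], task[len(task) // 2 :]]

verified = run_sequential(["ab", "cdefgh"], [model_a, model_b], halve)
```

In this toy run the long subtask is rejected once, split in half, and both halves are then accepted, so each disagreement is handled at the point where it first appears.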

artificial intelligence, large language model, natural language, (16 more...)

Country:

Europe (1.00)
Asia (0.68)
North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Consumer Products & Services (1.00)
Transportation (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

ManiDP: Manipulability-Aware Diffusion Policy for Posture-Dependent Bimanual Manipulation

Li, Zhuo, Liu, Junjia, Li, Dianxi, Teng, Tao, Li, Miao, Calinon, Sylvain, Caldwell, Darwin, Chen, Fei

arXiv.org Artificial Intelligence, Oct-28-2025

Recent work has demonstrated the potential of diffusion models in robot bimanual skill learning. However, existing methods ignore the learning of posture-dependent task features, which are crucial for adapting dual-arm configurations to meet specific force and velocity requirements in dexterous bimanual manipulation. To address this limitation, we propose Manipulability-Aware Diffusion Policy (ManiDP), a novel imitation learning method that not only generates plausible bimanual trajectories, but also optimizes dual-arm configurations to better satisfy posture-dependent task requirements. ManiDP achieves this by extracting bimanual manipulability from expert demonstrations and encoding the encapsulated posture features using Riemannian-based probabilistic models. These encoded posture features are then incorporated into a conditional diffusion process to guide the generation of task-compatible bimanual motion sequences. We evaluate ManiDP on six real-world bimanual tasks, where the experimental results demonstrate a 39.33% increase in average manipulation success rate and a 0.45 improvement in task compatibility compared to baseline methods. This work highlights the importance of integrating posture-relevant robotic priors into bimanual skill diffusion to enable human-like adaptability and dexterity.
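For readers unfamiliar with manipulability, the standard velocity manipulability ellipsoid of a manipulator with Jacobian J is M = J Jᵀ, and the Yoshikawa index is sqrt(det M). The sketch below computes both for a single planar two-link arm. It is a textbook illustration under that simplifying assumption, not the bimanual formulation used in the paper; the function names and unit link lengths are hypothetical.

```python
import math


def jacobian_2link(q1, q2, l1=1.0, l2=1.0):
    """Position Jacobian of a planar two-link arm (joint angles in radians)."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ]


def manipulability_ellipsoid(J):
    """Return M = J J^T, a symmetric positive semi-definite 2x2 matrix."""
    return [
        [sum(J[i][k] * J[j][k] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def yoshikawa_index(J):
    """Scalar manipulability sqrt(det(J J^T)); zero at a singular posture."""
    M = manipulability_ellipsoid(J)
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return math.sqrt(max(det, 0.0))  # clamp tiny negative rounding errors


bent_elbow = yoshikawa_index(jacobian_2link(0.3, math.pi / 2))
straight_arm = yoshikawa_index(jacobian_2link(0.3, 0.0))
```

For this arm the index equals l1·l2·|sin q2|, so it peaks with the elbow at a right angle and vanishes when the arm is fully stretched, which is why the same end-effector task can be easier or harder depending on posture.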

artificial intelligence, machine learning, manidp, (18 more...)

2510.23016

Country: Asia > China (0.47)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.48)

CodeVisionary: An Agent-based Framework for Evaluating Large Language Models in Code Generation

Wang, Xinchen, Gao, Pengfei, Peng, Chao, Hu, Ruida, Gao, Cuiyun

arXiv.org Artificial Intelligence, Oct-21-2025

Large language models (LLMs) have demonstrated strong capabilities in code generation, underscoring the critical need for rigorous and comprehensive evaluation. Existing evaluation approaches fall into three categories: human-centered, metric-based, and LLM-based. Considering that human-centered approaches are labour-intensive and metric-based ones overly rely on reference answers, LLM-based approaches are gaining increasing attention due to their stronger contextual understanding capabilities. However, they generally evaluate the generated code based on static prompts, and tend to fail for complex code scenarios, which typically involve multiple requirements and require more contextual information. In addition, these approaches lack fine-grained evaluation for complex code, resulting in limited explainability. To mitigate these limitations, we propose CodeVisionary, the first agent-based evaluation framework for complex code generation. CodeVisionary consists of two stages: (1) Requirement-guided multi-dimensional context distillation stage, which first formulates a detailed evaluation plan by decomposing task requirements, and then stepwise collects multi-dimensional contextual information for each requirement. A comprehensive evaluation report is also generated for enhanced explainability. For validation, we construct a new benchmark consisting of 363 samples spanning 37 coding scenarios and 23 programming languages. Extensive experiments demonstrate that CodeVisionary achieves the best performance among three baselines for evaluating complex code generation, outperforming the best baseline with average improvements of 0.217, 0.163, and 0.141 in Pearson, Spearman, and Kendall-Tau coefficients, respectively.

With the rapid development of large language models (LLMs), these models have demonstrated promising results in code generation [1], [2].
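The agreement metrics named in the abstract measure how closely an automatic evaluator's scores track human scores. The sketch below shows how two of them can be computed with the standard library alone. The score lists are made-up illustrative data, and `kendall_tau_a` implements the tau-a variant (no tie correction), which is an assumption; the abstract does not say which variant the authors use.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson linear correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for (a1, b1), (a2, b2) in combinations(zip(x, y), 2):
        sign = (a1 - a2) * (b1 - b2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Illustrative data only: human ratings vs. an automatic evaluator's ratings.
human_scores = [1, 2, 3, 4, 5]
evaluator_scores = [1, 3, 2, 5, 4]

r = pearson(human_scores, evaluator_scores)
tau = kendall_tau_a(human_scores, evaluator_scores)
```

Pearson rewards linear agreement in score values, while Kendall's tau only counts whether each pair of samples is ranked in the same order, so the two can diverge when an evaluator ranks well but is poorly calibrated.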

codevisionary, large language model, machine learning, (19 more...)

2504.13472

Country:

North America > United States (0.93)
Asia > China (0.93)
Europe > Austria > Vienna (0.14)

Genre: Research Report (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

AniME: Adaptive Multi-Agent Planning for Long Animation Generation

Zhang, Lisai, Xu, Baohan, Yang, Siqian, Yin, Mingyu, Liu, Jing, Xu, Chao, Wang, Siqi, Wu, Yidi, Hong, Yuxin, Zhang, Zihao, Liang, Yanzhang, Jiang, Yudong

arXiv.org Artificial Intelligence, Oct-13-2025

We present AniME, a director-oriented multi-agent system for automated long-form anime production, covering the full workflow from a story to the final video. The director agent keeps a global memory for the whole workflow and coordinates several downstream specialized agents. By integrating a customized Model Context Protocol (MCP) with downstream model instructions, each specialized agent adaptively selects control conditions for diverse sub-tasks. AniME produces cinematic animation with consistent characters and synchronized audio-visual elements, offering a scalable solution for AI-driven anime creation.

agent, artificial intelligence, specialized agent, (14 more...)

2508.18781

Country:

Asia > China (0.17)
North America > United States (0.14)

Genre: Workflow (0.72)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

InfiAgent: Self-Evolving Pyramid Agent Framework for Infinite Scenarios

Yu, Chenglin, Yu, Yang, Wang, Songmiao, Wang, Yucheng, Yang, Yifan, Li, Jinjia, Li, Ming, Yang, Hongxia

arXiv.org Artificial Intelligence, Oct-1-2025

Large Language Model (LLM) agents have demonstrated remarkable capabilities in organizing and executing complex tasks, and many such agents are now widely used in various application scenarios. However, developing these agents requires carefully designed workflows, carefully crafted prompts, and iterative tuning, which demands both LLM and domain-specific expertise. This dependence on handcrafted design hinders the scalability and cost-effectiveness of LLM agents across a wide range of industries. To address these challenges, we propose InfiAgent, a Pyramid-like DAG-based Multi-Agent Framework that can be applied to infinite scenarios, which introduces several key innovations: a generalized "agent-as-a-tool" mechanism that automatically decomposes complex agents into hierarchical multi-agent systems; a dual-audit mechanism that ensures the quality and stability of task completion; an agent routing function that enables efficient task-agent matching; and an agent self-evolution mechanism that autonomously restructures the agent DAG based on new tasks, poor performance, or optimization opportunities. Furthermore, InfiAgent's atomic task design supports agent parallelism, significantly improving execution efficiency. Evaluations on multiple benchmarks demonstrate that InfiAgent achieves 9.9% higher performance compared to ADAS (a similar auto-generated agent framework), while a case study of the AI research assistant InfiHelper shows that it generates scientific papers that have received recognition from human reviewers at top-tier IEEE conferences.

The rapid development of large-scale language models (LLMs) has ushered in a new era of intelligent automation (Naveed et al., 2025; Tran et al., 2025), with agent-based systems demonstrating remarkable capabilities in organizing and executing complex tasks across domains. From scientific research and software development to creative content generation and business process automation, LLM agents are transforming how we solve problems at scale. However, the development and deployment of these agents face significant challenges, limiting their widespread adoption and effectiveness. Current approaches to building LLM agents rely heavily on carefully designed workflows, carefully crafted prompts, and extensive iterative tuning, processes that require deep LLM expertise and domain-specific knowledge (Veeramachaneni, 2025; Guo et al., 2024; Annam et al., 2025; Schick et al., 2023). This reliance on handcrafted solutions creates a fundamental scalability barrier: each new application requires significant manual intervention, making it difficult to rapidly deploy agents across diverse industries and use cases.

artificial intelligence, large language model, natural language, (18 more...)

2509.22502

Country: North America > United States > Minnesota (0.28)

Genre:

Research Report (1.00)
Overview (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Inference-stage Adaptation-projection Strategy Adapts Diffusion Policy to Cross-manipulators Scenarios

Yao, Xiangtong, Zhou, Yirui, Meng, Yuan, Liu, Yanwen, Dong, Liangyu, Zhang, Zitao, Bing, Zhenshan, Huang, Kai, Sun, Fuchun, Knoll, Alois

arXiv.org Artificial Intelligence, Sep-16-2025

Diffusion policies are powerful visuomotor models for robotic manipulation, yet they often fail to generalize to manipulators or end-effectors unseen during training and struggle to accommodate new task requirements at inference time. Addressing this typically requires costly data recollection and policy retraining for each new hardware or task configuration. To overcome this, we introduce an adaptation-projection strategy that enables a diffusion policy to perform zero-shot adaptation to novel manipulators and dynamic task settings, entirely at inference time and without any retraining. Our method first trains a diffusion policy in SE(3) space using demonstrations from a base manipulator. During online deployment, it projects the policy's generated trajectories to satisfy the kinematic and task-specific constraints imposed by the new hardware and objectives. Moreover, this projection dynamically adapts to physical differences (e.g., tool-center-point offsets, jaw widths) and task requirements (e.g., obstacle heights), ensuring robust and successful execution. We validate our approach on real-world pick-and-place, pushing, and pouring tasks across multiple manipulators, including the Franka Panda and Kuka iiwa 14, equipped with a diverse array of end-effectors like flexible grippers, Robotiq 2F/3F grippers, and various 3D-printed designs. Our results demonstrate consistently high success rates in these cross-manipulator scenarios, proving the effectiveness and practicality of our adaptation-projection strategy. The code will be released after peer review.
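To make the idea of inference-time projection concrete, the sketch below projects generated waypoints onto a simple constraint set: a box-shaped workspace combined with a minimum height for clearing an obstacle. This is a deliberately reduced illustration, not the paper's method, which operates in SE(3) and handles kinematic constraints. All names, the tool-center-point offset handling, and the numbers are hypothetical.

```python
def clamp(value, low, high):
    """Euclidean projection of a scalar onto the interval [low, high]."""
    return max(low, min(high, value))


def project_waypoint(point, workspace, z_min):
    """Project (x, y, z) onto the workspace box intersected with z >= z_min."""
    (x_lo, x_hi), (y_lo, y_hi), (z_lo, z_hi) = workspace
    x, y, z = point
    return (
        clamp(x, x_lo, x_hi),
        clamp(y, y_lo, y_hi),
        clamp(z, max(z_lo, z_min), z_hi),
    )


def adapt_trajectory(trajectory, workspace, obstacle_height, tcp_offset_z=0.0):
    """Apply the projection to every waypoint produced by the base policy."""
    # Assumption: a longer tool shifts the required flange height upward.
    z_min = obstacle_height + tcp_offset_z
    return [project_waypoint(p, workspace, z_min) for p in trajectory]


# Hypothetical trajectory from a policy trained on a different manipulator.
policy_output = [(0.1, 0.0, 0.05), (0.4, 0.2, 0.02), (0.9, 0.2, 0.30)]
workspace = ((0.0, 0.8), (-0.5, 0.5), (0.0, 0.6))

adapted = adapt_trajectory(policy_output, workspace, obstacle_height=0.10, tcp_offset_z=0.05)
```

Because the constraint set here is an axis-aligned box, clamping each coordinate gives the exact nearest feasible point; more realistic constraints would generally need an optimization step per waypoint.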

artificial intelligence, gripper, optimization problem, (14 more...)

2509.11621

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.54)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

LinguaSafe: A Comprehensive Multilingual Safety Benchmark for Large Language Models

Ning, Zhiyuan, Gu, Tianle, Song, Jiaxin, Hong, Shixin, Li, Lingyu, Liu, Huacan, Li, Jie, Wang, Yixu, Lingyu, Meng, Teng, Yan, Wang, Yingchun

arXiv.org Artificial Intelligence, Aug-28-2025

The widespread adoption and increasing prominence of large language models (LLMs) in global technologies necessitate a rigorous focus on ensuring their safety across a diverse range of linguistic and cultural contexts. The lack of comprehensive evaluation and diverse data in existing multilingual safety evaluations for LLMs limits their effectiveness, hindering the development of robust multilingual safety alignment. To address this critical gap, we introduce LinguaSafe, a comprehensive multilingual safety benchmark crafted with meticulous attention to linguistic authenticity. The LinguaSafe dataset comprises 45k entries in 12 languages, ranging from Hungarian to Malay. Curated using a combination of translated, transcreated, and natively-sourced data, our dataset addresses the critical need for multilingual safety evaluations of LLMs, filling the void in the safety evaluation of LLMs across diverse under-represented languages. LinguaSafe presents a multidimensional and fine-grained evaluation framework, with direct and indirect safety assessments, including further evaluations for oversensitivity. The results of safety and helpfulness evaluations vary significantly across different domains and different languages, even in languages with similar resource levels. Our benchmark provides a comprehensive suite of metrics for in-depth safety evaluation, underscoring the critical importance of thoroughly assessing multilingual safety in LLMs to achieve more balanced safety alignment. Our dataset and code are released to the public to facilitate further research in the field of multilingual LLM safety.

large language model, machine learning, natural language, (20 more...)

2508.12733

Country: Asia > Middle East (0.28)

Genre:

Research Report > New Finding (0.67)
Overview > Growing Problem (0.48)

Industry:

Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Group Consensus-Driven Auction Algorithm for Cooperative Task Allocation Among Heterogeneous Multi-Agents

Wang, Gang, Han, Hongfang, Liu, Xiaowei, Jiang, Hanfeng, Zhang, Ming

arXiv.org Artificial Intelligence, Aug-5-2025

In scenarios like automated warehouses, assigning tasks to robots presents a heterogeneous multi-task and multi-agent task allocation problem. However, existing task allocation studies ignore the integration of multi-task and multi-attribute agent task allocation with heterogeneous task allocation. In addition, current algorithms are limited by scenario constraints and can incur significant errors in specific contexts. Therefore, this study proposes a distributed heterogeneous multi-task and multi-agent task allocation algorithm with a time window, called group consensus-based heterogeneous auction (GCBHA). First, this method decomposes tasks that exceed the capability of a single agent into subtasks that can be completed by multiple independent agents. It then groups similar or adjacent tasks through a heuristic clustering method to reduce the time required to reach a consensus. Subsequently, the task groups are allocated to agents that meet the conditions through an auction process. Furthermore, the method evaluates the task path cost distance based on the scenario, which allows the task cost to be calculated more accurately. The experimental results demonstrate that GCBHA performs well in terms of task allocation time and solution quality, with a significant reduction in the error rate between predicted task costs and actual costs.
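The auction step described above, in which task groups go to capable agents with the lowest cost, can be illustrated with a small sequential auction. This is a simplified, centralized sketch and not GCBHA itself: it omits task decomposition, clustering, time windows, and distributed consensus. Manhattan distance is an assumed stand-in for the scenario-specific path cost, and all names and data are hypothetical.

```python
def manhattan(a, b):
    """Assumed path-cost proxy for a grid-like warehouse floor."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def sequential_auction(task_groups, agents):
    """Auction task groups one at a time; the lowest capable bid wins.

    task_groups: {group: (position, required_skill)}
    agents:      {agent: (start_position, set_of_skills)}
    """
    positions = {name: start for name, (start, _) in agents.items()}
    assignment = {}
    for group, (location, skill) in task_groups.items():
        # Only agents with the required skill submit a bid (their travel cost).
        bids = {
            name: manhattan(positions[name], location)
            for name, (_, skills) in agents.items()
            if skill in skills
        }
        if not bids:
            continue  # no capable agent: leave the group unassigned
        winner = min(bids, key=bids.get)
        assignment[group] = winner
        positions[winner] = location  # the winner's next bid starts from here
    return assignment


agents = {
    "robot_1": ((0, 0), {"lift"}),
    "robot_2": ((5, 5), {"lift", "scan"}),
}
task_groups = {
    "group_a": ((1, 0), "lift"),
    "group_b": ((4, 5), "scan"),
    "group_c": ((2, 0), "lift"),
}

allocation = sequential_auction(task_groups, agents)
```

Updating the winner's position after each round is what makes later bids depend on earlier awards, a rough analogue of why accurate path-cost estimates matter for the final allocation.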

agent, artificial intelligence, machine learning, (16 more...)

2508.02015

Country: Asia > China (0.46)

Genre: Research Report > New Finding (0.48)

Industry: Transportation (0.69)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Chain-of-Trust: A Progressive Trust Evaluation Framework Enabled by Generative AI

Zhu, Botao, Wang, Xianbin, Zhang, Lei, Shen, Xuemin

arXiv.org Artificial Intelligence, Aug-4-2025

In collaborative systems with complex tasks relying on distributed resources, trust evaluation of potential collaborators has emerged as an effective mechanism for task completion. However, due to the network dynamics and varying information gathering latencies, it is extremely challenging to observe and collect all trust attributes of a collaborating device concurrently for a comprehensive trust assessment. In this paper, a novel progressive trust evaluation framework, namely chain-of-trust, is proposed to make better use of misaligned device attribute data. This framework, designed for effective task completion, divides the trust evaluation process into multiple chained stages based on task decomposition. At each stage, based on the task completion process, the framework only gathers the latest device attribute data relevant to that stage, leading to reduced trust evaluation complexity and overhead. By leveraging advanced in-context learning, few-shot learning, and reasoning capabilities, generative AI is then employed to analyze and interpret the collected data to produce correct evaluation results quickly. Only devices deemed trustworthy at this stage proceed to the next round of trust evaluation. The framework ultimately determines devices that remain trustworthy across all stages. Experimental results demonstrate that the proposed framework achieves high accuracy in trust evaluation.
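The staged filtering described in this abstract can be sketched as a chain of checks in which each stage looks only at the attribute relevant to it and only at devices that survived the previous stage. This is a minimal sketch under stated assumptions: the per-stage evaluation that the paper delegates to generative AI is replaced here by plain threshold predicates, and the device names, attributes, and thresholds are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# A stage pairs an attribute name with a trust check on that attribute's value.
Stage = Tuple[str, Callable[[float], bool]]


def chain_of_trust(devices: Dict[str, Dict[str, float]], stages: List[Stage]) -> List[str]:
    """Return the devices that remain trustworthy across every stage, in order."""
    trusted = list(devices)
    for attribute, is_trustworthy in stages:
        # Only this stage's attribute is read, and only for surviving devices.
        trusted = [d for d in trusted if is_trustworthy(devices[d][attribute])]
    return trusted


devices = {
    "dev_a": {"uptime": 0.99, "latency_ms": 20.0},
    "dev_b": {"uptime": 0.80, "latency_ms": 10.0},
    "dev_c": {"uptime": 0.97, "latency_ms": 90.0},
}
stages = [
    ("uptime", lambda value: value >= 0.95),
    ("latency_ms", lambda value: value <= 50.0),
]

survivors = chain_of_trust(devices, stages)
```

Because untrusted devices drop out early, later stages never need their attributes, which mirrors the reduced evaluation overhead the abstract describes.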

large language model, machine learning, natural language, (21 more...)

doi: 10.1109/MNET.2025.3582407

2506.1713

Country: North America > Canada (0.28)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.63)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.63)

Hybrid Voting-Based Task Assignment in Modular Construction Scenarios

Weiner, Daniel, Korpan, Raj

arXiv.org Artificial Intelligence, May-20-2025

Modular construction, involving off-site prefabrication and on-site assembly, offers significant advantages but presents complex coordination challenges for robotic automation. Effective task allocation is critical for leveraging multi-agent systems (MAS) in these structured environments. This paper introduces the Hybrid Voting-Based Task Assignment (HVBTA) framework, a novel approach to optimizing collaboration between heterogeneous multi-agent construction teams. Inspired by human reasoning in task delegation, HVBTA uniquely integrates multiple voting mechanisms with the capabilities of a Large Language Model (LLM) for nuanced suitability assessment between agent capabilities and task requirements. The framework operates by assigning Capability Profiles to agents and detailed requirement lists called Task Descriptions to construction tasks, subsequently generating a quantitative Suitability Matrix. Six distinct voting methods, augmented by a pre-trained LLM, analyze this matrix to robustly identify the optimal agent for each task. Conflict-Based Search (CBS) is integrated for decentralized, collision-free path planning, ensuring efficient and safe spatio-temporal coordination of the robotic team during assembly operations. HVBTA enables efficient, conflict-free assignment and coordination, facilitating potentially faster and more accurate modular assembly. Current work is evaluating HVBTA's performance across various simulated construction scenarios involving diverse robotic platforms and task complexities. While designed as a generalizable framework for any domain with clearly definable tasks and capabilities, HVBTA will be particularly effective for addressing the demanding coordination requirements of multi-agent collaborative robotics in modular construction due to the predetermined construction planning involved.

agent, artificial intelligence, hvbt, (15 more...)

2505.13278

Country: North America > United States (0.05)

Genre: Research Report (0.70)

Industry: Construction & Engineering (0.49)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
